Modeling Sensorimotor Behavior through Modular Inverse Reinforcement Learning with Discount Factors
Authors
Abstract
Similar Resources
Inverse Reinforcement Learning through Policy Gradient Minimization
Inverse Reinforcement Learning (IRL) deals with the problem of recovering the reward function optimized by an expert, given a set of demonstrations of the expert’s policy. Most IRL algorithms need to repeatedly compute the optimal policy for different reward functions. This paper proposes a new IRL approach that makes it possible to recover the reward function without solving any “direct” RL pr...
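The repeated "direct" RL solves that the abstract above mentions can be illustrated with a generic feature-expectation-matching loop (not the policy-gradient algorithm of the paper itself): every reward update requires re-solving the MDP for the current reward. All specifics below (the 5-state chain MDP, one-hot state features, learning rate, horizon) are invented for illustration.

```python
import numpy as np

# Toy 5-state chain MDP (hypothetical, for illustration only).
n_states, n_actions, gamma = 5, 2, 0.9

# Deterministic transitions P[a, s, s']: action 0 steps left, action 1 steps right.
P = np.zeros((n_actions, n_states, n_states))
for s in range(n_states):
    P[0, s, max(s - 1, 0)] = 1.0
    P[1, s, min(s + 1, n_states - 1)] = 1.0

def solve_mdp(r, tol=1e-8):
    """The 'direct' RL problem: greedy policy for reward vector r via value iteration."""
    V = np.zeros(n_states)
    while True:
        Q = r[None, :] + gamma * (P @ V)          # Q[a, s]
        V_new = Q.max(axis=0)
        if np.abs(V_new - V).max() < tol:
            return Q.argmax(axis=0)               # deterministic greedy policy
        V = V_new

def feature_expectations(policy, horizon=100):
    """Discounted expected visits to one-hot state features, starting in state 0."""
    mu, d = np.zeros(n_states), np.zeros(n_states)
    d[0] = 1.0
    for t in range(horizon):
        mu += gamma ** t * d
        d_next = np.zeros(n_states)
        for s in range(n_states):
            d_next += d[s] * P[policy[s], s]      # push visitation mass forward
        d = d_next
    return mu

expert_policy = np.ones(n_states, dtype=int)      # demonstrated expert: always move right
mu_expert = feature_expectations(expert_policy)

w = np.zeros(n_states)                            # reward weights to recover
for _ in range(20):
    learned_policy = solve_mdp(w)                 # a full RL solve at every IRL step
    w += 0.1 * (mu_expert - feature_expectations(learned_policy))

learned_policy = solve_mdp(w)
print(learned_policy)                             # recovers the expert's right-moving policy
```

The inner `solve_mdp` call inside the update loop is exactly the per-iteration cost that approaches like the one above aim to avoid.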
Inverse Reinforcement Learning through Structured Classification
This paper addresses the inverse reinforcement learning (IRL) problem, that is, inferring a reward for which a demonstrated expert behavior is optimal. We introduce a new algorithm, SCIRL, whose principle is to use the so-called feature expectation of the expert as the parameterization of the score function of a multiclass classifier. This approach produces a reward function for which the expert ...
Robust Bayesian Inverse Reinforcement Learning with Sparse Behavior Noise
Inverse reinforcement learning (IRL) aims to recover the reward function underlying a Markov Decision Process from the behaviors of experts, in support of decision-making. Most recent work on IRL assumes the same level of trustworthiness for all expert behaviors, and frames IRL as a process of seeking a reward function that makes those behaviors appear (near-)optimal. However, it is common in reality th...
Grounding the Meanings in Sensorimotor Behavior using Reinforcement Learning
The recent surge of interest in cognitive developmental robotics is fueled by the ambition to propose ecologically plausible mechanisms of how, among other things, a learning agent/robot could ground linguistic meanings in its sensorimotor behavior. In this line of work, we propose a model that allows the simulated iCub robot to learn the meanings of actions (point, touch, and push) oriented to...
Inverse Reinforcement Learning with PI2
We present an algorithm that recovers an unknown cost function from expert-demonstrated trajectories in continuous space. We assume that the cost function is a weighted linear combination of features, and we are able to learn weights that result in a cost function under which the expert demonstrated trajectories are optimal. Unlike previous approaches [1], [2], our algorithm does not require re...
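The "weighted linear combination of features" assumption above is a standard parameterization: a trajectory's cost is the inner product of a weight vector with the trajectory's accumulated features. A minimal sketch, with the feature matrix and weights invented purely for illustration:

```python
import numpy as np

def trajectory_cost(weights, feature_traj):
    """Cost of a trajectory as a weighted linear combination of features:
    c(tau) = sum over time steps t of w . phi(x_t)."""
    return float(np.sum(feature_traj @ weights))

# Hypothetical example: 3 time steps, 2 features per step.
phi = np.array([[1.0, 0.0],
                [0.5, 0.5],
                [0.0, 1.0]])      # phi(x_t), one row per time step
w = np.array([2.0, 1.0])          # learned feature weights

print(trajectory_cost(w, phi))    # 2.0 + 1.5 + 1.0 = 4.5
```

Learning then reduces to finding `w` such that the expert's demonstrated trajectories have lower cost than alternatives.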
Journal
Journal title: Journal of Vision
Year: 2017
ISSN: 1534-7362
DOI: 10.1167/17.10.1267